Evolution Strategies in Optimization Problems* 

Pedro A. F. Cruz Delfim F. M. Torres 

Department of Mathematics 
University of Aveiro 
3810-193 Aveiro, Portugal 

{pedrocruz , delf imjOua . pt 

Abstract 

Evolution Strategies are inspired in biology and part of a larger re- 
search field known as Evolutionary Algorithms. Those strategies perform 
a random search in the space of admissible functions, aiming to optimize 
some given objective function. We show that simple evolution strategies 
are a useful tool in optimal control, permitting to obtain, in an efficient 
way, good approximations to the solutions of some recent and challenging 
optimal control problems. 
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1 Introduction 

Evolution Strategies (ES) are algorithms inspired in biology, with publications 
dating back to 1965 by separate authors H.P. Schwefel and I. Rechenberg (cf. 
[5]). ES are part of a larger area called Evolutionary Algorithms that perform 
a random search in the space of solutions aiming to optimize some objective 
function. It is common to use biological terms to describe these algorithms. Here 
we make use of a simple ES algorithm known as the (/i. A)— ES method [5], where 
jji is the number of progenitors and A is the number of generated approximations, 
called offsprings. Progenitors are recombined and mutated to produce, at each 
generation, A offsprings with innovations sampled from a multivariate normal 
distribution. The variance can also be subject to mutation, meaning that it is 
part of the genetic code of the population. Every solution is evaluated by the 
objective function and one or some of them selected to be the next progenitors, 
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allowing the search to go on, stopping when some criteria is met. In this paper 
we use a recent convergence result proved by A. Auger in 2005 [3]. The log- linear 
convergence is achieved for the optimization problems we investigate here, and 
depends on the number A of search points. 

Usually optimal control problems are approximately solved by means of nu- 
merical techniques based on the gradient vector or the Hessian matrix • Com- 
pared with these techniques, ES provide easier computer coding because they 
only use measures from a discretized objective function. A first work combining 
these two research fields (ES and optimal control) was done by D.S. Szarkowicz 
in 1995 [13], where a Monte Carlo method (an algorithm with the same prin- 
ciple as ES) is used to find an approximation to the classical brachistochrone 
problem. In the late nineties of the XX century, B. Porter and his collaborators 
showed how ES are useful to synthesize optimal control policies that minimize 
manufacturing costs while meeting production schedules [8] . The use of ES in 
Control has grown during the last ten years, and is today an active and promis- 
ing research area. Recent results, showing the power of ES in Control, include 
Hamiltonian synthesis [IT] , robust stabilization [TU] , and optimization [?] . Very 
recently, it has also been shown that the theory of optimal control provides 
insights that permit to develop more efective ES algorithms [T]. 

In this work we are interested in two classical problems of the calculus of 
variations: the 1696 brachistochrone problem and the 1687 Newton's aerody- 
namical problem of minimal resistance (see e.g. |15|). These two problems, 
although classical, are source of strong current research on optimal control and 
provide many interesting and challenging open questions [71 [13] . We focus our 
study on the brachistochrone problem with restrictions proposed by A.G. Ramm 
in 1999 pjj, for which some questions still remain open (see some conjectures 
in |12|): and on a generalized aerodynamical minimum resistance problem with 
non-parallel flux of particles, recently studied by Plakhov and Torres |7l [16] . 
Our results show the effectiveness of ES algorithms for this class of problems 
and motivate further work in this direction in order to find the (yet) unknown 
solutions to some related problems, as the ones formulated in [5]. 

2 Problems and Solutions 

All the problems we are interested in share the same formulation: 



on some specified class of functions, where y{-) must satisfy some given boundary 
conditions {xo,yo) and {xf,yf). 

We consider a simplified (/i. A)— ES algorithm where we put /i = 1, meaning 
that on each generation we keep only one progenitor to generate other candidate 
solutions, and set A = 10 meaning we generate 10 candidate solutions called 
offsprings (this value appear as a reference value in the literature). Also, the 
algorithm uses an individual and constant cr^ variance on each coordinate, which 
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is fixed to a small value related with the desired precision. The number of 
iterations was 100 000 and was tuned for each problem. We got convergence 
in useful time. The simplified (1, 10)— ES algorithm goes as follow: 

1. Set an equal spaced sequence of n points {xq, . . . , Xi, . . . , Xf} where i = 
1, . . . , 71 — 2; Xq and Xf are kept fixed (given boundary conditions); 

2. Generate a randomly piecewise linear function y{-) that approximate the 
solution, defined by a vector y = {yo, . . . , j/i, . . . , ?//}, i — 1, . . . , n — 2; 
transform y in order to satisfy the boundary conditions yo and yj and the 
specific problem restrictions on y, y' or y"; 

3. Do the following steps a fixed number N of times: 

(a) based on y find A new candidate solutions V^, c — 1, . . . , A, where 
each new candidate is produced hy V ^ y + N(0, cr^) where N(0, cr^) 
is a vector of random perturbations from a normal distribution; trans- 
form each to obey boundary conditions yo and yf and other prob- 
lem restrictions on y, y' or y"; 

(b) determine T'^ := T[F'^], c = 1, . . . , A, and choose the new y := Y'^ as 
the one with minimum T"^. 

In each iteration the best solution must be kept because (/i, A)— ES algorithms 
don't keep the best solution from iteration to iteration. 

The next subsections contain a description of the studied problems, respec- 
tive solutions and the approximations found by the described algorithm. 

2.1 The classical brachistochrone problem, 1696 

Problem statement. The brachistochrone problem consists in determining 
the curve of minimum time when a particle starting at a point A — {xo,yo) 
of a vertical plan goes to a point B = {xi,yi) in the same plane under the 
action of the gravity force and with no initial velocity. According to the energy 
conservation law ^mv^ + mgy — mgyo one easily deduce that the time a particle 
needs to reach B starting from point A along curve y{-) is given by 



where y(xo) = yo, ?/(a^i) = Hi, and y € C'^{xo,xi). The minimum to ([T]) is given 
by the famous Cycloid: 



with Oq < 9 < 9i, Oq and 9i the values of 9 in the starting and ending points 
{xo,yo) 8.nd (xi^yi). The minimum time is given by T = y^a/{2g)9i, where 
parameters a and 9i can be determined numerically from boundary conditions. 




(1) 



{ 



X = xo + ^{9 - sin 6*) 
y^yo- f(l-cos0) 
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(a) Continuous line with dots is the piece- (b) Logarithm of iterations vs. logarithmic 
wise approximate solution; the dashed line distance to the minimum value of JTJl 
the optimal solution. 

Figure 1: The brachistochrone problem and approximate solution. 



Results and implementation details. Consider the following three curves 
and the correspondent time a particle needs to go from A to B through them: 

Tf,: The brachistochrone for the problem with (a;o,a;i) = (0,10), (yo,yi) = 
(10,0) has parameters a ~ 5.72917 and Oi = 2.41201; the time is Tt ~ 
1.84421: 



A piecewise linear function with 20 segments shown in fig. 1(a) was found 
by ES; the time is T^s = 1.85013; 

A piecewise linear function with 20 segments defined over the Brachisto- 
chrone; the time is Tq = 1.85075. 



From fig. 1(a) one can see that the piecewise linear solution is made of points 



that are not over the brachistochrone because that is not the best solution for 



piecewise functions. We use a = 0.01 (see appendix for cpu-times). Fig. 1(b) 
shows that a little more than 10 000 iterations are needed to reach a good 
solution for the 20 line segment problem. 

2.2 Brachistochrone problem with restrictions, 1999 

Problem statement. Ramm (1999) [T^] presents a conjecture about a brachis- 
tochrone problem over the set S of convex functions y (with y" {x) > a.e.) and 
< y{x) < yo{x), where j/q is a straight line between A = (0, 1) and B — {b, 0), 
6 > 0. Up to a constant, the functional to be minimized is formulated as in ([T]): 



Jo \/i -y 
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Let P be the line connecting AO and OB, where O = (0, 0); Pbr be the polygonal 
line connecting AC and CB, C = (7r/ 2,0). Then, Tq := r(?/o) = 2\/l + 62, 
Tp := T(P) = 2 + 6, T{Pbr) = \/4+^ + - 7r/2. Let the brachistochrone be 
yiir- The following inequalities, for each y ^ S, hold [T2] : 

1. if < 5 < 4/3 then T(yb^) < r(y) < Tp; 

2. if 4/3 < 6 < 7r/2 then T(j/b^) < T(y) < Tq; 

3. if 6 > 7r/2 then T{Pbr) < T{y) < Tq. 

The classical brachistochrone solution holds for cases 1 and 2 only. For the 
third case, Ramm has conjectured that the minimum time curve is composed 
by the brachistochrone between (0, 1) and (7r/2, 0) and then by the horizontal 
segment between {it/2,0) and (x/,0). 

Results and implementation details. We study the problem with b — 2. 
Our results give force to Ramm's conjecture mentioned above for case 3. We 
compare three descendant times: 

Tbr- The conjectured solution in continuous time takes Thr — a/ a/9.80/ + (6 — 
7r/2)/V2 * 9.8 = 0.8066; 

Tes- The 20 segment piecewise linear solution found by ES needs T^s — 0.8107; 

To'. The 20 segment piecewise linear solution with points over the conjectured 
solution needs To — 0.8111. 

Previous values and fig. [2] permits to take similar conclusions than the ones 
obtained for the pure brachistochrone problem ( §2.1|) . We use a = 0.001 (see 
appendix for cpu-times). Fig. |2(b)| shows that less than 10 000 iterations are 
needed to reach a good solution. 



2.3 Newton's minimum resistance, 1687 

Problem statement. Newton's aerodynamical problem consists in determining 
the minimum resistance profile of a body of revolution moving at constant speed 
in a rare medium of equally spaced particles that don't interact with each other. 
Collisions with the body are assumed to be perfectly elastic. Formulation of this 
problem is: minimize 

where < x < r, y(0) = 0, y{r) = H and y'{x) > 0. The solution is given in 
parametric form: 

x{u) — 2Xu , y{u) = , for w e [0, 1] ; 

X{u) ^ ^{^+2u+U^) , y(u) = ■^(-logM+U^ + ^w"')-^ , for W e [1, Umax] • 

Parameters A and Umax are obtained solving x(umax) — r and t/(umax) = H. 
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(a) Continuous line with dots is the ob- (b) Logarithm of iterations vs. logarithmic 
tained approximated solution; dashed line distance to minimum integral value. 
Ramm's conjectured solution. 

Figure 2: Ramm's conjectured solution and approximate solution. 

Results and implementations details. For H — 2 we have: 

Rnewton- The cxact solution has resistance Rnewton — 0.0802; 

Res- The 20 segment piecewise linear solution found by ES has Res ~ 0.0809; 

Ro'. The 20 segment piecewise linear solution with points over the exact solution 
leads to Ro = 0.0808. 

Newton's problem reveals to be more complex than previously studied brachis- 
tochrone problems. Trial- and-error was needed in order to find a useful 
value. For example, using a — 0.001 our algorithm seems to stop in some local 
minimum. In fig. [3] an approximate solution with cr = 0.01 is shown. We also 
have observed that changing the starting point causes minor differences in the 
approximate solution. The achieved ES solution should be better since Ro is 
better than Res- One possible explanation for this fact is that we are using 20 
Xi fixed points and the optimal solution has a break point at a: = 2A. We use 
a = 0.01 (see appendix for cpu-times). Fig. |3(b)| shows that less than 1000 
iterations are needed to reach a good solution. 

2.4 Newton's problem with temperature, 2005 

Problem statement. The problem consists in determining the body of mini- 
mum resistance, moving with constant velocity in a rarefied medium of chaot- 
ically moving particles with velocity distributions assumed to be radially sym- 
metric in the Euclidian space W^. This problem was posed and solved in 2005- 
2006 by Plakhov and Torres [71 [16]. It turns out that the two-dimensional 
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< log„(ileralion!) 

(a) Continuous line with dots is the obtained (b) Logarithm of iterations vs. logarithmic 
approximation; the dashed line the optimal distance to minimum integral value, 
solution. 

Figure 3: Optimal solution to Newton's problem and approximation. 

problem {d = 2) is more richer than the three-dimensional one, being possible 
five types of solutions when the velocity of the moving body is not 'too slow' or 
'too fast' compared with the velocity of particles. 

The pressure at the body surface is described by two functions: in the front 
of the body the flux of particles causes resistance, in the back the flux causes 
acceleration. We consider functions found in [16], where the two flux functions 
p+ and p- are given by p+{u) — jq^^ + 0.5 and P-{u) = — 0.5. We 

also consider a body of fixed radius 1. The optimal solution depends on the 
body height h: the front solution is denoted by fh^-, which depends on some 
appropriate front height and the solution for the rear is denoted by fh_, 
depending on some appropriate height h-. Optimal solutions fh+ and fh_ are 
obtained: ^ 

=min R+(/,.)= / p+{m)dt 

and ^ 

A_ =minR_(/,)= / p_(/,;(i))di. 

Then, the body shape is determined by minimizing 

R{h)^ min (R+(/;:j+R_(/L)). 

Solution can be of five types (d = 2). From functions and p_ one can 
determine constants u'^, m*, and h-. Then, depending on the choice of the 
height h, theory developed in [Zl[16] asserts that the minimum resistance body 
is: 
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(a) Continuous line with dots is the obtained (b) Logarithm of iterations vs. logarithmic 
approximation; dashed line the optimum. distance to minimum integral value. 

Figure 4: 2D Newton-type problem with temperature. 

1. a trapezium if < /i < 

2. an isosceles triangle if < /i < u*; 

3. the union of a triangle and a trapezium if u.^, < h < u^, + u'^; 

4. if /i > + the solution depends on /i_ and can be a union of two 
isosceles triangles with common base with heights /i^ and /i_ or the union 
of two isosceles triangles and a trapezium; 

5. a combination of a triangle, trapezium and other triangle, depending on 
some other particular conditions (cf. [7]). 

Results and implementation details. We illustrate the use of ES algo- 
rithms for h = 2. Following section 4.1 of we have ~ 1.60847 and u'^ = 1, 
so this is case 3 above: < < + u'L. The resistance values are: 

Rpd- The exact solution has resistance Rpd — 0.681; 

Res- The 31 segment piecewise linear solution found by ES has Res = 0.685; 

Similar to the classical problem of Newton ( ij2.3[) . some hand search for the 
parameter cr^ was needed. We use a = 0.01 and piecewise approximation with 
31 equal spaced segments in xx (see appendix for cpu-times). Fig. |3(b)] shows 
that only httle more than 1 000 iterations are needed to reach a good solution. 
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3 Conclusions and future directions 



Our main conclusion is that a simple ES algorithm can be effectively used as a 
tool to find approximate solutions to some optimization problems. In the present 
work we report simulations that motivate the use of ES algorithms to find good 
approximate solutions to brachistochrone-type and Newton-type problems. We 
illustrate our approach with the classical problems and with some recent and still 
challenging problems. More precisely, we considered the 1696 brachistochrone 
problem (B); the 1687 Newton's aerodynamical problem of minimal resistance 
(N); a recent brachistochrone problem with restrictions (R) studied by Ramm in 
1999, and where some open questions still remain [12j : and finally a generalized 
aerodynamical minimum resistance problem with non-parallel flux of particles 
(P), recently studied by Plakhov and Torres [71 [TB] and which gives rise to other 
interesting questions [6^ • 

We argue that the approximated solutions we have found by the ES algo- 
rithm are of good quality. We give two reasons. First, for the Brachistochrone 
and Ramm's problems the functional value for the ES approximation was bet- 
ter than the linear interpolation over the exact solution, showing that ES algo- 
rithm is capable of a good precision. The second reason is the low relative error 
riTy, Ty) between the functional over the exact solution Ty and the approximate 
solution Ty, as shown in the following table: 



Pr. 


maxllfc - yk\ 


riTy.Ty) 


Pr. 


max|yfc - yk\ 


r{TY.Ty) 


(B) 


0.15 


0.001 


(N) 


0.08 


0.01 


(R) 


0.09 


0.003 


(P) 


0.07 


0.001 



where yk are points over the exact solution of the problem and Yk are points 
from the piecewise approximation. We note that max|Yfe — yk\ need not to be 
zero because the best continuous solution and the best linear solution cannot 
be superposed. 

ES algorithms use computers in an intensive way. For brachistochrone-type 
and Newton-type problems, and nowadays computing power, few minutes of 
simulation (or less) were enough on an interpreted language (see appendix). 

More research is needed to tune this kind of algorithms and obtain more 
accurate solutions. Special attention must be put in qualifying an obtained 
ES approximation: Is it a minimum of the energy function? Is it local or 
global? Another question is computer efficiency. Waiting few minutes in recent 
computers is not bad, but can we improve the running times? 

Concerning the accuracy, several new ES algorithms have been proposed. 
These algorithms can tune a values and use generated second order information 
that can influence the precision and time needed. Also the use of random xx 
points (besides y piecewise linear solution) should be investigated. 

We believe that the simplicity of the technique considered in the present 
work can help in the search of solutions to some open problems in optimal 
control. This is under investigation and will be addressed elsewhere. 
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Appendix — hardware and software 



The code developed for this work can be freely obtained from the first author's 
web page, at http://www.mat.ua.pt/jpedro/evolution/. 

In most of our investigations few minutes were sufRcient for getting a good 
approximation for all the considered problems, even using a code style prone to 
humans rather than machines (code was done concerning clearness of concepts 
rather than execution speed). Our simulations used a Pentium 4 CPU 3 GHz, 
running Debian Linux http://www.debian.org. The language was R [9], cho- 
sen because it is a fast interpreted language, numerically oriented to statistics 
and freely available. 

The following CPU-times were obtained with command 

time R CMD BATCH problem. R 

where time keeps track of cpu used and R calls the interpreter. The times are 
rounded and the last column estimates the time for a first good solution: 



Problem 


Section 


100 000 iterations 


'Good solution' 


Brachistochronc 


832.11 


10 min 


1 min 


Ramm conjecture 




10 min 


1 min 


Newton 




9 min 


10 sec 


Plakhov & Torres 




14 min 


10 sec 



We note that the per iteration 'step' was a ~ 0.001 in the brachistochronc (- 
type) problems and tr — 0.01 for the Newton(-type) problems. Using a compiled 
language like C one can certainly improve times by several orders of magnitude. 
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